A Concept Language Model for Ad-hoc Retrieval

Authors

  • Bin Zou
  • Vasileios Lampos
  • Shangsong Liang
  • Zhaochun Ren
  • Emine Yilmaz
  • Ingemar J. Cox
Abstract

We propose an extension to language models for information retrieval. Typically, language models estimate the probability of a document generating the query, where the query is considered as a set of independent search terms. We extend this approach by considering the concepts implied by both the query and words in the document. The model combines the probability of the document generating the concept embodied by the query, and the traditional language model probability of the document generating the query terms. We use a word embedding space to express concepts. The similarity between two vectors in this space is estimated using a weighted cosine distance. The weighting significantly enhances the discrimination between vectors. We evaluate our model on benchmark datasets (TREC 6–8) and empirically demonstrate it outperforms state-of-the-art baselines.
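The abstract does not give the weighting scheme or the exact way the two probabilities are combined, so the following is only a minimal sketch under stated assumptions: a per-dimension non-negative weight vector for the weighted cosine, and a linear interpolation (hypothetical parameter `lam`) between a concept-level score and a query-term score. The function names are illustrative and are not taken from the paper.

```python
import math


def weighted_cosine(u, v, w):
    """Cosine similarity with per-dimension weights w (assumed non-negative).

    With all weights equal to 1 this reduces to the ordinary cosine.
    """
    num = sum(wi * ui * vi for wi, ui, vi in zip(w, u, v))
    norm_u = math.sqrt(sum(wi * ui * ui for wi, ui in zip(w, u)))
    norm_v = math.sqrt(sum(wi * vi * vi for wi, vi in zip(w, v)))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # undefined for a zero vector; treat as "no similarity"
    return num / (norm_u * norm_v)


def mean_vector(vectors):
    """Average a non-empty list of equal-length embedding vectors.

    Assumption: the query 'concept' is represented as the mean of its
    term embeddings; the abstract does not say how it is built.
    """
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def combined_score(p_concept, p_terms, lam=0.5):
    """Linear interpolation of the two scores (an assumed combination)."""
    return lam * p_concept + (1.0 - lam) * p_terms


# Usage with toy 2-dimensional embeddings (made-up numbers).
query_concept = mean_vector([[1.0, 0.0], [1.0, 2.0]])
doc_word = [1.0, 0.0]
similarity = weighted_cosine(query_concept, doc_word, [1.0, 1.0])
score = combined_score(p_concept=similarity, p_terms=0.4, lam=0.5)
```

Setting a weight to zero removes that dimension from the comparison entirely, which is one simple way a weighting can sharpen the separation between otherwise similar vectors.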

Similar Resources

Collaborative Learning of Term-Based Concepts for Automatic Query Expansion

Information Retrieval Systems have been studied in Computer Science for decades. The traditional ad-hoc task is to find all documents relevant to a given ad-hoc query, but the accuracy of ad-hoc document retrieval systems has plateaued in recent years. At DFKI, we are working on so-called collaborative information retrieval (CIR) systems which unintrusively learn from their users' search proces...

MeSH Based Feedback, Concept Recognition and Stacked Classification for Curation Tasks

This paper reports on experiments carried out in the context of the genomics track at TREC 2004. Experiments were concentrated on two subtasks: the ad hoc retrieval task and the triage task. Experiments for the ad hoc task aimed at improving a standard full-text ad-hoc run (using a language modeling approach) by exploiting the manual classification of MEDLINE abstracts (the MeSH terms) for r...

A Two-Stage Retrieval Model for the TREC-7 Ad Hoc Task

A two-stage model for ad hoc text retrieval is proposed in which recall and precision are maximized sequentially. The first stage employs query expansion methods using WordNet and a modified stemming algorithm. The second stage incorporates a term proximity-based scoring function and a prototype-based reranking method. The effectiveness of the two-stage retrieval model is tested on the TREC-7 ad...

Information Retrieval from Large Textbases

Our objective is to enhance the effectiveness of retrieval and routing operations for large-scale textbases. Retrieval concerns the processing of ad hoc queries against a static document collection, while routing concerns the processing of static, trained queries against a document stream. Both may be viewed as trying to rank relevant answer documents high in the output. Our text processing and ...

Experiment Report of TREC 2005 Genomics Track ad hoc Retrieval Task

This report describes the experiments we have conducted on the ad hoc retrieval task of the Genomics track at TREC 2005. In the experiment, a number of different techniques were employed, including Porter stemming, MeSH term and gene name identification, Okapi, weighting schemes, query expansion, and a concept-based ranking strategy. The results on sample topics are reported. Future improvements, suc...

XRCE's Participation in Wikipedia Retrieval, Medical Image Modality Classification and Ad-hoc Retrieval Tasks of ImageCLEF 2010

This year, XRCE participated in three main tasks of ImageCLEF 2010. The Visual Concept Detection and Annotation Task is presented in a separate paper. In this working note, we rather focus on our participation in the Wikipedia Retrieval Task and in two sub-tasks of the Medical Retrieval Task (Image Modality Classification and Ad-hoc Image Retrieval). We investigated mono-modal (textual and visu...

Publication date: 2017